Automatic Segmentation of Parasitic Sounds in Speech Corpora for TTS Synthesis
Abstract
This paper presents the automatic segmentation of parasitic speech sounds in speech corpora for text-to-speech (TTS) synthesis. Alongside the automatic detection of such sounds, automatic segmentation is an important step towards localising them precisely in the corpora. The main goal of the study is to find out whether these sounds can be segmented accurately enough to be cut out of synthetic speech or modelled explicitly during synthesis. An HMM-based classifier was employed to detect the parasitic sounds and, simultaneously, to find the boundaries between them and the surrounding phones. The results show that the segmentation accuracy for parasitic sounds is comparable to that for other phones, which indicates that cutting these sounds out or using them explicitly should be possible.
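As a rough illustration of how a single decoding pass can both detect a parasitic sound and place its boundaries, the sketch below runs a Viterbi alignment over a left-to-right state sequence in which the parasitic state may be skipped. It is not the classifier used in the paper: the function name `align`, the label `GLOT`, the use of one state per phone, and the per-frame log-likelihood inputs are all assumptions made for the example, and transition probabilities are ignored.

```python
import math


def align(frame_scores, states, optional):
    """Viterbi alignment of frames to a left-to-right state sequence.

    frame_scores: one dict per frame, mapping state label -> log-likelihood
                  (hypothetical inputs; a real system would get these from
                  trained acoustic models).
    states:       ordered labels, e.g. ["a", "GLOT", "o"].
    optional:     set of indices into `states` that may be skipped; only one
                  optional state in a row can be skipped in this sketch.
    Returns a list of (label, start_frame, end_frame_exclusive).
    """
    n_frames, n_states = len(frame_scores), len(states)
    neg_inf = -math.inf
    best = [[neg_inf] * n_states for _ in range(n_frames)]
    back = [[-1] * n_states for _ in range(n_frames)]

    def predecessors(j):
        # Stay in j, advance from j-1, or jump from j-2 over an optional j-1.
        preds = [j]
        if j >= 1:
            preds.append(j - 1)
        if j >= 2 and (j - 1) in optional:
            preds.append(j - 2)
        return preds

    # The path starts in the first state, or in the second if the first is optional.
    best[0][0] = frame_scores[0][states[0]]
    if n_states > 1 and 0 in optional:
        best[0][1] = frame_scores[0][states[1]]

    for t in range(1, n_frames):
        for j in range(n_states):
            prev = max(predecessors(j), key=lambda i: best[t - 1][i])
            if best[t - 1][prev] > neg_inf:
                best[t][j] = best[t - 1][prev] + frame_scores[t][states[j]]
                back[t][j] = prev

    # The path ends in the last state, or in the one before it if the last is optional.
    finals = [n_states - 1]
    if n_states > 1 and (n_states - 1) in optional:
        finals.append(n_states - 2)
    j = max(finals, key=lambda i: best[-1][i])
    if best[-1][j] == neg_inf:
        raise ValueError("too few frames to cover the mandatory states")

    # Trace the best path back to the first frame.
    path = [j]
    for t in range(n_frames - 1, 0, -1):
        j = back[t][j]
        path.append(j)
    path.reverse()

    # Collapse runs of identical states into labelled segments.
    segments, start = [], 0
    for t in range(1, n_frames + 1):
        if t == n_frames or path[t] != path[start]:
            segments.append((states[path[start]], start, t))
            start = t
    return segments
```

If the returned segments contain the `GLOT` label, the sound was detected and its frame boundaries come from the same pass; if the label is absent, the decoder preferred the path that skips it. Frame indices convert to time once a frame shift is chosen, which this sketch leaves open.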
Similar resources
Fully automatic segmentation for prosodic speech corpora
While automatic methods for phonetic segmentation of speech can help with rapid annotation of corpora, most methods rely either on manually segmented data to initially train the process or manual post-processing. This is very time-consuming and slows down porting of speech systems to new languages. In the context of prosody corpora for text-to-speech (TTS) systems, we investigated methods for f...
Robust Automatic Continuous Speech Segmentation for Indian Languages to Improve Speech to Speech Translation
This paper provides an analysis of phrase and word boundary detection in a background of noise, which occurs in the context of Automatic Speech Recognition (ASR) and Text-To-Speech (TTS) synthesis systems for Indian languages. ASR and TTS are the major components of a Speech-To-Speech Translation (STST) system. Both always need a speech signal to be segmented into some basic units like phrases...
Adapting Prosody in a Text-to-Speech System
The requirements of the evolving information communication technologies (ICT) place new demands on text-to-speech (TTS) systems. The modern high quality TTS system has to be capable of fast and high-quality adaptation to a new language, voice or even expressive speech. Thus adaptation to new voices with different prosodic characteristics is desired. In this chapter a survey of recent and past a...
A fusion approach for automatic speech segmentation of large corpora with application to speech synthesis
This paper deals with the automatic segmentation of large speech corpora in the case when the phonetic sequence corresponding to the speech signal is known. A direct and typical application is corpus-based Text-To-Speech (TTS) synthesis. We start by proposing a general approach for combining several segmentations produced by different algorithms. Then, we describe and analyse three automatic se...
Towards Linguistic Naturalness of Synthetic Speech
This paper presents another step towards linguistic naturalness of synthetic Czech. The main goal of this study is to avoid unintended occurrences of parasitic speech sounds (namely preglottalization) in synthesised speech. Firstly, we explain what we mean by the term parasitic speech sound. Secondly, procedures for both automatic detection and segmentation of these sounds in source speech reco...